Experiment: AVA11.

This document follows the same structure as the main manuscript and presents figures addressing the first two research questions.

Experiment- and individual-level figures for the third research question (Role of set size 1) are in a separate document.

There are a number of ways of summarizing individuals’ data when modeling them or presenting the results. Depending on the particular statistic (model fits, predictions, etc.), some approaches are more common or more sensible than others. In the manuscript we presented the model fitting results in two ways. Whenever we discussed model comparison, we referred to models fitted to individuals, with fits averaged across individuals. Whenever we discussed models’ predictions, on the other hand, we looked at predictions based on models’ fits to data aggregated across individuals in one experiment. We chose this approach so that inferences would not be unduly influenced by noisy predictions based on unstable fits to sparse individual data. Nonetheless, this is not the only possible approach.

In this supplemental material, we show a variety of approaches for summarizing the data to complement the choices we made for the manuscript. For all three research questions, we show experiment-level and individual-level plots corresponding to the information provided in the manuscript: stability of model fits, model comparison by AIC (and cross-validation for individual-level data), predicted and observed error distributions, predicted/observed summary statistics, and normalized root mean square deviation (NRMSD). For the experiment-level data in particular, there are various approaches to aggregating or averaging the data, so we explain these in more detail in the separate tabs below.

Research question 1: Parameterization of memory precision and response noise

Experiment-level data

By experiment-level data, we mean the data in AVA11 as a collection of the data provided by all 45 participants. The manner in which the data, model fitting results and predictions of an experiment are summarized can vary depending on the goal and approach. For some aspects of the data, some approaches make more sense than others. For completeness’ sake we show a variety here, even if some are of limited use.

Fit to aggregate data

Here we treated all data in AVA11 as if they had been provided by a single participant (rather than by 45 participants). We fitted all models to this aggregate data set. The model comparison shows relative model fits on the basis of the fit to the aggregate data.

In the error distribution and summary statistics plots, the data are the observed data aggregated across individuals. The predictions (error distributions, resultant summary statistics and normalized RMSD relative to the aggregate data) are based on each model’s best-fit parameter estimates from the fit to the aggregate data.
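The pooling step described above can be sketched as follows. This is a minimal illustration only: the tuple layout `(participant_id, set_size, error)`, the function name `pool_across_participants` and the example values are assumptions made for the sketch, not the format or code used for AVA11.

```python
from collections import defaultdict


def pool_across_participants(trials):
    """Pool response errors across participants, keyed by set size only.

    `trials` is a list of (participant_id, set_size, error) tuples.
    This layout is hypothetical and chosen for illustration.
    """
    pooled = defaultdict(list)
    for _participant, set_size, error in trials:
        # Participant identity is deliberately dropped: the pooled data
        # are treated as if one participant had produced every trial.
        pooled[set_size].append(error)
    return dict(pooled)


# Made-up trials from two hypothetical participants (not AVA11 data).
trials = [
    ("P1", 2, 0.10),
    ("P1", 4, -0.35),
    ("P2", 2, -0.05),
    ("P2", 4, 0.60),
]
aggregate = pool_across_participants(trials)
```

Each model would then be fitted once to `aggregate`, rather than once per participant.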

Averaging individual parameter estimates

In this tab, all graphs are shown for the purpose of illustrating the performance in the experiment on average, i.e., averaged across participants.

The model comparison plot is based on averaging models’ fits relative to the best-fit model. We show two rows of graphs. The top row includes all participants. The bottom row excludes participants with extreme test/hold-out set deviance in LOSsO-CV for one of the models, to allow comparison of model comparison results across approaches (this was done in the model comparison plots in the manuscript). Here in AVA11, we excluded 6 participants (AVA11_E4_19, AVA11_E4_21, AVA11_E4_23, AVA11_E4_40, AVA11_E4_8, AVA11_E4_9); the number of participants in each panel is shown in the top-right corner.
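As a rough sketch of this averaging, the snippet below computes each model’s AIC difference to the best-fitting model and averages those differences across participants, with and without an exclusion list. It assumes that the difference is taken within each participant relative to that participant’s best model, which the text does not specify; the function names, model labels, participant labels and AIC values are all made up for illustration.

```python
from statistics import mean


def aic(deviance, n_params):
    # AIC = deviance + 2k, with deviance defined as -2 * log-likelihood.
    return deviance + 2 * n_params


def mean_delta_aic(aic_by_participant, exclude=()):
    """Average each model's AIC difference to the per-participant best model."""
    deltas = {}
    for pid, model_aics in aic_by_participant.items():
        if pid in exclude:
            continue  # e.g. participants with extreme hold-out deviance
        best = min(model_aics.values())
        for model, value in model_aics.items():
            deltas.setdefault(model, []).append(value - best)
    return {model: mean(values) for model, values in deltas.items()}


# Made-up AIC values for three hypothetical participants and two models.
aics = {
    "P1": {"M1": 100.0, "M2": 104.0},
    "P2": {"M1": 210.0, "M2": 206.0},
    "P3": {"M1": 150.0, "M2": 190.0},
}
all_participants = mean_delta_aic(aics)            # analogue of the top row
after_exclusion = mean_delta_aic(aics, {"P3"})     # analogue of the bottom row
```

In this toy case a single participant with a very large difference dominates the average for one model, which is the kind of influence the exclusion row is meant to make visible.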

The predictions of the behavioral signature pattern (error distribution and resultant summary statistics) are based on averaging the best-fit parameter estimates from the individual fits. In both graphs, the observed data are the aggregate data.

Averaging individual predictions

In the tab “Averaging individual parameter estimates”, we looked at the prediction of error distributions and resultant summary statistics on the basis of averaging participants’ best-fit parameter estimates. Here, the predictions are based on averaging individuals’ predictions (each of which was based on that individual’s best-fit parameter estimates). For the summary statistics (across set sizes, and normalized RMSD) we show two graphs: A) deriving the summary statistics from the averaged error distributions (with NRMSDs computed relative to the summary statistics of the aggregate data), and B) averaging individuals’ summary statistics, and similarly aggregating individuals’ normalized RMSDs.
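The two tabs generally give different predicted distributions, because a model’s prediction is a nonlinear function of its parameters. The toy sketch below illustrates this with a zero-mean normal density standing in for a model’s predicted error distribution. The density, the parameter values and all names are assumptions made for illustration and are not the models fitted to AVA11.

```python
import math
from statistics import mean


def predicted_density(error, sd):
    # Toy stand-in for a model's predicted error distribution:
    # a zero-mean normal density with standard deviation `sd`.
    return math.exp(-0.5 * (error / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


individual_sds = [0.2, 0.8]            # made-up best-fit estimates
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]     # response errors to evaluate

# Averaging parameter estimates: average first, then predict once.
from_mean_params = [predicted_density(e, mean(individual_sds)) for e in grid]

# Averaging predictions: predict per participant, then average.
from_mean_preds = [
    mean(predicted_density(e, sd) for sd in individual_sds) for e in grid
]
```

With these made-up values, the averaged predictions are more peaked at zero error and heavier in the tails than the prediction from the averaged parameter.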

Plots labeled “A”. The following summary statistics are derived from the averaged error distribution above. The graphs with the NRMSD show the comparison of these averaged predictions to the aggregate data. This is not particularly useful, as the distribution underlying the aggregate data is not necessarily the average of the individuals’ data, but it is one approach to summarizing the data.

Plots labeled “B”. The following summary statistics are based on averaging participants’ summary statistics, for both the data (e.g., the summary statistics derived from individual observed error distributions) and the model predictions (e.g., the summary statistics derived from individual predicted error distributions). The graphs with the NRMSD show the normalized RMSD across individuals as boxplots, to give an idea of the spread of the NRMSD in the experiment for these models (i.e., NOT the normalized RMSD derived from contrasting averaged predicted and observed summary statistics).
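The distinction drawn for the “B” plots can be made concrete with a small sketch. It assumes that the RMSD is normalized by the range of the observed values, which is one common convention and is not confirmed by this document; the summary-statistic values, participant labels and names are made up for illustration.

```python
import math
from statistics import mean


def nrmsd(predicted, observed):
    """RMSD divided by the observed range (the normalization is an assumption)."""
    rmsd = math.sqrt(mean((p - o) ** 2 for p, o in zip(predicted, observed)))
    return rmsd / (max(observed) - min(observed))


# Made-up summary statistic at three set sizes for two hypothetical participants.
observed = {"P1": [0.2, 0.4, 0.6], "P2": [0.1, 0.5, 0.9]}
predicted = {"P1": [0.3, 0.4, 0.5], "P2": [0.1, 0.4, 0.9]}

# What the "B" boxplots summarize: one NRMSD per participant.
per_participant = {pid: nrmsd(predicted[pid], observed[pid]) for pid in observed}

# What they do NOT show: the NRMSD of the averaged summary statistics.
avg_observed = [mean(col) for col in zip(*observed.values())]
avg_predicted = [mean(col) for col in zip(*predicted.values())]
nrmsd_of_averages = nrmsd(avg_predicted, avg_observed)
```

In this toy case the NRMSD of the averages differs from the mean of the per-participant values, because deviations in opposite directions partly cancel when the summary statistics are averaged first.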

Individual-level data

Below are all graphs for all individuals in the experiment. Additionally, we included one graph showing the stability of the AIC model fits for all models by plotting the difference in deviance terms between the best and second-best run of each model for each individual’s data set.
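The stability measure described above can be sketched as the gap between the two lowest deviances obtained across repeated fitting runs of one model to one participant’s data. The function name and the deviance values are hypothetical.

```python
def deviance_gap(run_deviances):
    """Difference between the second-lowest and the lowest deviance."""
    if len(run_deviances) < 2:
        raise ValueError("need at least two fitting runs")
    best, second_best = sorted(run_deviances)[:2]
    return second_best - best


# Made-up deviances from four fitting runs with different starting values.
runs = [512.4, 512.4, 530.1, 519.8]
gap = deviance_gap(runs)
```

A gap near zero indicates that at least two runs converged on the same minimum, whereas a large gap suggests the best run may not be reproducible.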

AVA11_E4_1

AVA11_E4_10

AVA11_E4_11

AVA11_E4_12

AVA11_E4_13

AVA11_E4_14

AVA11_E4_15

AVA11_E4_16

AVA11_E4_17

AVA11_E4_18

AVA11_E4_19

AVA11_E4_2

AVA11_E4_20

AVA11_E4_21

AVA11_E4_22

AVA11_E4_23

AVA11_E4_24

AVA11_E4_25

AVA11_E4_26

AVA11_E4_27

AVA11_E4_28

AVA11_E4_29

AVA11_E4_3

AVA11_E4_30

AVA11_E4_31

AVA11_E4_32

AVA11_E4_33

AVA11_E4_34

AVA11_E4_35

AVA11_E4_36

AVA11_E4_37

AVA11_E4_38

AVA11_E4_39

AVA11_E4_4

AVA11_E4_40

AVA11_E4_41

AVA11_E4_42

AVA11_E4_43

AVA11_E4_44

AVA11_E4_45

AVA11_E4_5

AVA11_E4_6

AVA11_E4_7

AVA11_E4_8

AVA11_E4_9

Research question 2: Response noise and limit to memory capacity

Experiment-level data

By experiment-level data, we mean the data in AVA11 as a collection of the data provided by all 45 participants. The manner in which the data, model fitting results and predictions of an experiment are summarized can vary depending on the goal and approach.

Fit to aggregate data

Here we treated all data in AVA11 as if they had been provided by a single participant (rather than by 45 participants). We fitted all models to this aggregate data set. The model comparison shows relative model fits on the basis of the fit to the aggregate data.

In the error distribution and summary statistics plots, the data are the observed data aggregated across individuals. The predictions (error distributions, resultant summary statistics and normalized RMSD relative to the aggregate data) are based on each model’s best-fit parameter estimates from the fit to the aggregate data.

Averaging individual parameter estimates

In this tab, all graphs are shown for the purpose of illustrating the performance in the experiment on average, i.e., averaged across participants.

The model comparison plot is based on averaging models’ fits relative to the best-fit model. We show two rows of graphs. The top row includes all participants. The bottom row excludes participants with extreme test/hold-out set deviance in LOSsO-CV for one of the models, to allow comparison of model comparison results across approaches (this was done in the model comparison plots in the manuscript). Here in AVA11, we excluded 4 participants (AVA11_E4_19, AVA11_E4_21, AVA11_E4_23, AVA11_E4_17); the number of participants in each panel is shown in the top-right corner.

The predictions of the behavioral signature pattern (error distribution and resultant summary statistics) are based on averaging the best-fit parameter estimates from the individual fits. In both graphs, the observed data are the aggregate data.

Averaging individual predictions

In the tab “Averaging individual parameter estimates”, we looked at the prediction of error distributions and resultant summary statistics on the basis of averaging participants’ best-fit parameter estimates. Here, the predictions are based on averaging individuals’ predictions (each of which was based on that individual’s best-fit parameter estimates). For the summary statistics (across set sizes, and normalized RMSD) we show two graphs: A) deriving the summary statistics from the averaged error distributions (with NRMSDs computed relative to the summary statistics of the aggregate data), and B) averaging individuals’ summary statistics, and similarly aggregating individuals’ normalized RMSDs.

Plots labeled “A”. The following summary statistics are derived from the averaged error distribution above. The graphs with the NRMSD show the comparison of these averaged predictions to the aggregate data. This is not particularly useful, as the distribution underlying the aggregate data is not necessarily the average of the individuals’ data, but it is one approach to summarizing the data.

Plots labeled “B”. The following summary statistics are based on averaging participants’ summary statistics, for both the data (e.g., the summary statistics derived from individual observed error distributions) and the model predictions (e.g., the summary statistics derived from individual predicted error distributions). The graphs with the NRMSD show the normalized RMSD across individuals as boxplots, to give an idea of the spread of the NRMSD in the experiment for these models (i.e., NOT the normalized RMSD derived from contrasting averaged predicted and observed summary statistics).

Individual-level data

Below are all graphs for all individuals in the experiment. Additionally, we included one graph showing the stability of the AIC model fits for all models by plotting the difference in deviance terms between the best and second-best run of each model for each individual’s data set.

AVA11_E4_1

AVA11_E4_10

AVA11_E4_11

AVA11_E4_12

AVA11_E4_13

AVA11_E4_14

AVA11_E4_15

AVA11_E4_16

AVA11_E4_17

AVA11_E4_18

AVA11_E4_19

AVA11_E4_2

AVA11_E4_20

AVA11_E4_21

AVA11_E4_22

AVA11_E4_23

AVA11_E4_24

AVA11_E4_25

AVA11_E4_26

AVA11_E4_27

AVA11_E4_28

AVA11_E4_29

AVA11_E4_3

AVA11_E4_30

AVA11_E4_31

AVA11_E4_32

AVA11_E4_33

AVA11_E4_34

AVA11_E4_35

AVA11_E4_36

AVA11_E4_37

AVA11_E4_38

AVA11_E4_39

AVA11_E4_4

AVA11_E4_40

AVA11_E4_41

AVA11_E4_42

AVA11_E4_43

AVA11_E4_44

AVA11_E4_45

AVA11_E4_5

AVA11_E4_6

AVA11_E4_7

AVA11_E4_8

AVA11_E4_9